Cluster analysis using different correlation coefficients

Authors

  • Seong S. Chae
  • Jong-Min Kim
  • William D. Warde
Abstract

Partitioning objects into closely related groups that have different states makes it possible to understand the underlying structure of the data set. Different similarity measures are commonly combined with clustering algorithms to find an optimal clustering, or one close to the original clustering. Using shrinkage-based and rank-based correlation coefficients, which are known to be robust, the recovery level of six chosen clustering algorithms is evaluated using Rand’s C values. Recovery levels using the weighted likelihood estimate of the correlation coefficient are also obtained and compared with the results from those correlation coefficients when applying agglomerative clustering algorithms.
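The evaluation described in the abstract can be illustrated with a small sketch: turn a correlation coefficient into a dissimilarity, run an agglomerative clustering, and score how well the known grouping is recovered with Rand's C (the proportion of object pairs on which two partitions agree). Everything specific here is an assumption for illustration only: Spearman's coefficient stands in for the rank-based correlations, average linkage stands in for the six algorithms (which the abstract does not name), and the toy profiles and function names are hypothetical, not taken from the paper.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Rank-based correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def average_linkage(objects, k, corr=spearman):
    """Agglomerative clustering into k groups with 1 - corr as dissimilarity."""
    n = len(objects)
    d = [[0.0 if i == j else 1 - corr(objects[i], objects[j])
          for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        # Merge the pair of clusters with the smallest mean pairwise dissimilarity.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: mean(d[i][j]
                               for i in clusters[p[0]] for j in clusters[p[1]]),
        )
        clusters[a] += clusters.pop(b)  # a < b, so index a is unaffected
    labels = [0] * n
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def rand_index(u, v):
    """Rand's C: fraction of object pairs treated the same way by both partitions."""
    pairs = list(combinations(range(len(u)), 2))
    agree = sum((u[i] == u[j]) == (v[i] == v[j]) for i, j in pairs)
    return agree / len(pairs)


# Hypothetical toy data: three increasing and three decreasing profiles,
# two of them with an outlying value that a rank-based coefficient ignores.
objects = [
    [1, 2, 3, 4, 5], [2, 4, 5, 8, 40], [1, 3, 4, 6, 7],
    [5, 4, 3, 2, 1], [9, 7, 6, 3, 2], [50, 5, 4, 2, 1],
]
truth = [0, 0, 0, 1, 1, 1]
labels = average_linkage(objects, k=2)
recovery = rand_index(truth, labels)
print("Rand's C:", recovery)
```

Passing `corr=pearson` instead of the default shows how the choice of coefficient enters the comparison: only the dissimilarity matrix changes, while the clustering step and the recovery score stay the same.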

Similar resources

s-CorrPlot: Encoding and Exploring Correlation

Figure 1: Visualizations of correlation for a dataset containing 22,000 variables. The left two images show the correlation coefficients using a heatmap, clustered with (a) average linkage, and (b) complete linkage. The visible patterns in the heatmap are highly dependent on the clustering algorithm. In (c), our novel s-CorrPlot spatially encodes correlation coefficients, highlighting very diff...

Using matrix of thresholding partial correlation coefficients to infer regulatory network

DNA arrays measure the expression levels for thousands of genes simultaneously under different conditions. These measurements reflect many aspects of the underlying biological processes. A method based on the matrix of thresholding partial correlation coefficients (MTPCC) is proposed for network inference from expression profiles. It includes three main parts: (1) hierarchical cluster analysis,...

Comparison of similarity coefficients used for cluster analysis with dominant markers in maize (Zea mays L)

The objective of this study was to evaluate whether different similarity coefficients used with dominant markers can influence the results of cluster analysis, using eighteen inbred lines of maize from two different populations, BR-105 and BR-106. These were analyzed by AFLP and RAPD markers and eight similarity coefficients were calculated: Jaccard, Sorensen-Dice, Anderberg, Ochiai, Simple-mat...

Study of morphological diversity of black cumin (Nigella sativa L.) accessions using multivariate statistical methods

Nigella sativa L. (black cumin), belonging to the Ranunculaceae family, is one of the most important medicinal plants, and both wild and cultivated forms of this plant are used in Iran. The genetic diversity of 27 accessions of N. sativa L. from different places in Iran was characterized by morphological characteristics, and the data were analyzed using univariate and multivariate analyses. ANOVA revealed high s...

Visualizing Correlation

The well-known fact that Pearson’s product-moment correlation coefficient between two variables is the cosine of the angle between the centered variable profiles suggests a way to visualize correlation. This angular representation of product-moment correlation is automatically displayed in an h-plot. Using ideas from multidimensional scaling, an alternative angular representation of correlation...

Publication date: 2006